m-tables: Representing Missing Data
Authors
Abstract
Representation systems have been widely used to capture different forms of incomplete data in various settings. However, existing representation systems are not expressive enough to handle the more complex scenarios of missing data that occur in practice, which range from missing attribute values to a known number of missing tuples, or even an unknown number of missing tuples. In this work, we propose a new representation system called m-tables that can represent many different types of missing data. We show that m-tables form a closed, complete, and strong representation system under both set and bag semantics and are strictly more expressive than conditional tables under both the closed and open world assumptions. We further study the complexity of computing certain and possible answers in m-tables. Finally, we discuss how to “interpret” m-tables through a novel labeling scheme that marks a type of generalized tuples as certain or possible.
1998 ACM Subject Classification H.2.4 [Database Management] Systems
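The notions of certain and possible answers mentioned in the abstract can be illustrated with a deliberately simplified sketch. It does not implement the m-table formalism. It assumes a toy relation with null markers, each filled independently from a small finite domain, under closed-world set semantics. The relation, the domain, and the query are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy relation Emp(name, dept); None marks a missing attribute value.
TABLE = [("alice", "sales"), ("bob", None)]
DOMAIN = ["sales", "hr"]  # assumed finite domain for the missing value


def possible_worlds(table, domain):
    """Yield every complete relation obtained by filling each None from the domain."""
    holes = [(i, j) for i, row in enumerate(table)
             for j, value in enumerate(row) if value is None]
    for values in product(domain, repeat=len(holes)):
        rows = [list(row) for row in table]
        for (i, j), value in zip(holes, values):
            rows[i][j] = value
        yield {tuple(row) for row in rows}


def query(world):
    """Example query: names of employees in the sales department."""
    return {name for name, dept in world if dept == "sales"}


answers = [query(world) for world in possible_worlds(TABLE, DOMAIN)]
certain = set.intersection(*answers)  # returned in every possible world
possible = set.union(*answers)        # returned in at least one possible world

print("certain:", sorted(certain))
print("possible:", sorted(possible))
```

Here "alice" is a certain answer because she is in sales in every completion, while "bob" is only a possible answer. Enumerating worlds is exponential in the number of missing values, which is one reason the complexity of computing such answers is studied separately.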
Similar Resources
Impute Missing Assessments by Opinion Clustering in Multi-Criteria Group Decision Making Problems
Multi-criteria group decision-making and evaluation (MCGDME) methods typically aggregate information in evaluation tables. For various reasons, evaluation tables (decision matrices) often include missing data that strongly affect correct decision-making and evaluation. Most existing imputation methods for missing data are based on statistical features that do not exist in an MCGDME setting. This pa...
A Visualization Tool for Mining Large Correlation Tables: The Association Navigator
The Association Navigator is an interactive visualization tool for viewing large tables of correlations. The basic operation is zooming and panning of a table that is presented in graphical form, here called a “blockplot”. The tool is really a tool box that includes, among other things: (1) display of p-values and missing value patterns in addition to correlations, (2) mark-up facilities to hig...
Robust Singular Value Decomposition
The singular value decomposition of a rectangular data matrix can be used to understand the structure of the data and give insight into the relationships of the row and column factors. For example, the experimental conditions linked to the rows might be temperatures and the experimental conditions linked to the columns might be pressures. In a biological setting the rows might be linked to t...
Rough Set Approaches to Rule Induction from Incomplete Data
In this paper we assume that data are presented in the form of decision tables, incomplete when some attribute values are missing. Two main cases of missing attribute values are considered: lost (the original value was erased) and "do not care" conditions (the original value was irrelevant). This paper uses, as the main tool, attribute-value pair blocks. These blocks are used to construct chara...
Rough Set Strategies to Data with Missing Attribute Values
In this paper we assume that a data set is presented in the form of an incompletely specified decision table, i.e., some attribute values are missing. Our next basic assumption is that some of the missing attribute values are lost (e.g., erased) and some are "do not care" conditions (i.e., they were redundant or not necessary to make a decision or to classify a case). Incompletely specified de...